MSQ-Index: A Succinct Index for Fast Graph Similarity Search
Abstract
Graph similarity search under the graph edit distance constraint has received considerable attention in many applications, such as bioinformatics, data mining, pattern recognition, and social networks. Existing methods for this problem have limited scalability because of the huge amount of memory they consume when handling very large graph databases with tens of millions of graphs. In this article, we present a succinct index that incorporates succinct data structures and hybrid encoding to achieve improved query time performance with minimal space usage. Specifically, the space usage of our index requires only 5-15 percent of the previous state-of-the-art indexing size on the tested data while at the same time achieving several times acceleration in query time. We also improve the query performance by augmenting the global filter with range searching, which allows us to perform similarity search in a reduced region. In addition, we propose two effective lower bounds together with a boosting technique to obtain the smallest possible candidate set. Extensive experiments demonstrate that our proposed approach is superior both in space and filtering to the state-of-the-art approaches. To the best of our knowledge, our index is the first in-memory index for this problem that successfully scales to cope with the large dataset of 25 million chemical structure graphs from the PubChem dataset. The source code is available online.
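To make the filtering idea in the abstract concrete, the following is a minimal sketch of a filter step based on two textbook lower bounds on graph edit distance under unit edit costs: a size-count bound and a label-multiset bound. It is not the MSQ-Index or the paper's own lower bounds; the `Graph` class, the function names, and the toy molecules are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy labelled graph (hypothetical type, not from the paper)."""
    vertex_labels: dict = field(default_factory=dict)  # vertex id -> label
    edge_labels: dict = field(default_factory=dict)    # (u, v) -> label


def count_lower_bound(g, h):
    """Each unit-cost edit changes |V| + |E| by at most one, so the
    size difference can never exceed the edit distance."""
    return (abs(len(g.vertex_labels) - len(h.vertex_labels))
            + abs(len(g.edge_labels) - len(h.edge_labels)))


def label_lower_bound(g, h):
    """Labels that cannot be matched between the two label multisets
    each require at least one insert, delete, or relabel."""
    def unmatched(a, b):
        common = sum((Counter(a.values()) & Counter(b.values())).values())
        return max(len(a), len(b)) - common
    return (unmatched(g.vertex_labels, h.vertex_labels)
            + unmatched(g.edge_labels, h.edge_labels))


def filter_candidates(query, database, tau):
    """Return indices of graphs whose lower bounds do not exceed tau.
    Survivors still need exact verification of the edit distance."""
    return [i for i, h in enumerate(database)
            if count_lower_bound(query, h) <= tau
            and label_lower_bound(query, h) <= tau]


# Toy data: three small "molecules" (assumed, for illustration only).
query = Graph({0: "C", 1: "O"}, {(0, 1): "single"})
same = Graph({0: "C", 1: "O"}, {(0, 1): "single"})
relabelled = Graph({0: "C", 1: "N"}, {(0, 1): "double"})
larger = Graph({0: "C", 1: "C", 2: "C", 3: "C", 4: "O"},
               {(0, 1): "single", (1, 2): "single",
                (2, 3): "single", (3, 4): "single"})
database = [same, relabelled, larger]
```

With a threshold of `tau = 1`, only the identical graph survives the filter; raising the threshold to 2 also admits the relabelled graph, while the larger graph is pruned by its size difference alone. Because both functions are lower bounds, a graph pruned this way cannot be a true answer, which is what makes a small candidate set safe.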
Similar resources
MSQ-Index: A Succinct Index for Fast Graph Similarity Search
Graph similarity search has received considerable attention in many applications, such as bioinformatics, data mining, pattern recognition, and social networks. Existing methods for this problem have limited scalability because of the huge amount of memory they consume when handling very large graph databases with millions or billions of graphs. In this paper, we study the problem of graph simi...
MLR-Index: An Index Structure for Fast and Scalable Similarity Search in High Dimensions
High-dimensional indexing has been widely used for similarity search over various data types such as multimedia (audio/image/video) databases, document collections, time-series data, sensor data, and scientific databases. Because of the curse of dimensionality, it is already known that well-known data structures like kd-tree, R-tree, and M-tree suffer in their performance over...
Ashwini Index of a Graph
Motivated by the terminal Wiener index, we define the Ashwini index $\mathcal{A}$ of trees as $\mathcal{A}(T) = \sum\limits_{1 \leq i \ldots}$
Bounds for the Co-PI index of a graph
In this paper, we present some inequalities for the Co-PI index involving some topological indices, the number of vertices and edges, and the maximum degree. After that, we give a result for trees. In addition, we give some inequalities for the largest eigenvalue of the Co-PI matrix of G.
Index-Supported Similarity Search Using Multiple Representations
Similarity search in complex databases is of utmost interest in a wide range of application domains. Often, complex objects are described by several representations. The combination of these different representations usually contains more information than a single representation. In our work, we introduce the use of an index structure in combination with a negotiation-theory-based approac...
Journal
Journal title: IEEE Transactions on Knowledge and Data Engineering
Year: 2021
ISSN: 1558-2191, 1041-4347, 2326-3865
DOI: https://doi.org/10.1109/tkde.2019.2954527